

THEMIS SIGNAL ANALYSIS STATISTICS RESEARCH PROGRAM


CORRELATION  BETWEEN  TWO  VECTOR  VARIABLES 
by 

A.  M.  Kshirsagar 




Technical  Report  No.  26 
Department of Statistics THEMIS Contract

March  4,  1969 


Research  sponsored  by  the  Office  of  Naval  Research 
Contract N00014-68-A-0515
Project  NR  042-260 


Reproduction  in  whole  or  in  part  is  permitted 
for any purpose of the United States Government.


Distribution  of  this  document  is  unlimited. 


DEPARTMENT  OF  STATISTICS 
Southern  Methodist  University 


CORRELATION  BETWEEN  TWO  VECTOR  VARIABLES 




A.  M.  Kshirsagar* 
Southern Methodist University
Dallas,  Texas  75222 


SUMMARY 


H. Ruben (1966) has suggested a simple approximate normalization for
the correlation coefficient in normal samples, by representing it as the
ratio of a linear combination of a standard normal variable and a chi variable
to an independent chi variable and then using Fisher's approximation to a
chi variable. This result is extended in this paper to a matrix, which in
a sense is the correlation coefficient between two vector variables $x$ and $y$.
The result is then used to obtain large sample null and non-null (but in the
linear case) distributions of the Hotelling-Lawley criterion and the Pillai
criterion in multivariate analysis. Williams (1955) and Bartlett (1951) have
derived some exact tests for the goodness of fit of a single hypothetical
function to bring out adequately the entire relationship between two vectors
$x$ and $y$, by factorizing Wilks' $\Lambda$ suitably. These factors are known as
"direction" and "collinearity" factors, as they refer to the direction and
collinearity aspects of the null hypothesis. In this paper, the other two
criteria, viz. the Hotelling-Lawley and Pillai criteria, are partitioned into
direction and collinearity parts and large sample tests corresponding to them
are derived for testing the goodness of fit of an assigned function.

1. INTRODUCTION


If $r$ is the sample correlation coefficient between $x$ and $y$, $r^2$ is
the ratio of the regression sum of squares to the total sum of squares and
$r^2/(1 - r^2)$ is the ratio of the regression sum of squares to the residual
sum of squares, in the regression of $x$ on $y$. When, however, we consider the
regression of a $p \times 1$ vector $x$ on a $q \times 1$ vector $y$ $(p \le q)$, we shall obtain
two $p \times p$ symmetric matrices corresponding to the regression of $x$ on $y$ and the
residual. Let these be denoted by $B$ and $A$ respectively so that $A + B$ is
the "total" matrix. Matrix generalizations of $r^2$ and $r^2/(1 - r^2)$ can be
obtained from $B$, $A$ and $A + B$ by expressing $A + B$ as $CC'$ and $A$ as $FF'$ where
$C$ and $F$ are lower triangular matrices. Then $C^{-1}BC'^{-1}$ can be looked upon
as a generalization of $r^2$ and $F^{-1}BF'^{-1}$ of $r^2/(1 - r^2)$. Ruben (1966) expressed

$$\tilde{r} = r/\sqrt{1 - r^2} = (\xi + \tilde{\rho}\,\chi_{n-1})/\chi_{n-2}$$

where $\xi$ is a $N(0, 1)$ variable, $\chi_a$ denotes a chi-variate with $a$ degrees of
freedom (d.f.) and $\xi$, $\chi_{n-1}$, $\chi_{n-2}$ are independent. $\tilde{\rho}$ is the population
parameter. A similar representation is derived in this paper for the matrix
generalization of $\tilde{r}$ and is used to obtain an approximate large sample
normalization of this matrix.
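As an informal check on Ruben's representation, the following sketch (not part of the original report) draws $r$ in two ways: from the representation above, and directly from bivariate normal samples. The function names, the sample size and the value of $\rho$ are illustrative assumptions; $n$ is taken to be the number of observations, and $\tilde{\rho} = \rho/\sqrt{1 - \rho^2}$ is assumed by analogy with $\tilde{r}$.

```python
import math
import random


def ruben_r(n, rho, rng):
    """One draw of r built from Ruben's representation (n = sample size, assumed)."""
    rho_t = rho / math.sqrt(1.0 - rho * rho)                   # rho-tilde (assumed form)
    xi = rng.gauss(0.0, 1.0)                                   # N(0, 1) variable
    chi_n1 = math.sqrt(rng.gammavariate((n - 1) / 2.0, 2.0))   # chi with n-1 d.f.
    chi_n2 = math.sqrt(rng.gammavariate((n - 2) / 2.0, 2.0))   # chi with n-2 d.f.
    r_t = (xi + rho_t * chi_n1) / chi_n2                       # r-tilde
    return r_t / math.sqrt(1.0 + r_t * r_t)                    # transform back to r


def sample_r(n, rho, rng):
    """Sample correlation from n bivariate normal pairs with correlation rho."""
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_and_var(values):
    """Sample mean and unbiased sample variance of a list of numbers."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v


def compare(n=20, rho=0.5, reps=4000, seed=1):
    """Return ((mean, var) via representation, (mean, var) via direct sampling)."""
    rng = random.Random(seed)
    via_representation = [ruben_r(n, rho, rng) for _ in range(reps)]
    via_sampling = [sample_r(n, rho, rng) for _ in range(reps)]
    return mean_and_var(via_representation), mean_and_var(via_sampling)
```

If the representation holds, the two (mean, variance) pairs returned by `compare()` should agree up to simulation error.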

Several multivariate problems can be put into the framework of relationship
between two vectors $x$ and $y$. The following three criteria are generally
used in multivariate analysis to test lack of association between $x$ and $y$:

(1) Wilks' $\Lambda$: $\Lambda = |A| / |A + B|$

(2) Pillai's criterion: $\operatorname{tr} B(A + B)^{-1}$

(3) Hotelling-Lawley criterion: $\operatorname{tr} BA^{-1}$

Large sample null and non-null (linear case) distributions of the last two
criteria are derived, using the approximate normalization of the generalization
of $\tilde{r}$, and further a suitable partitioning of the two criteria, analogous to
the factorization of Wilks' $\Lambda$ by Bartlett (1951), for testing the goodness
of fit of a single hypothetical function $a_1 x_1 + \cdots + a_p x_p$ is derived.

2. MATRIX GENERALIZATION OF $\tilde{r}$
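Let the variance-covariance matrix of the two vectors $x$ and $y$ be

$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \tag{2.1}$$

and let the matrix of corrected sum of squares (s.s.) and sum of products
(s.p.) of observations in a sample on these variables be

$$S = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}, \tag{2.2}$$

where both matrices are partitioned into $p$ and $q$ rows and $p$ and $q$ columns.
This is based on $n$ d.f. Then we have the following matrices:

$$B_0 = \text{matrix of regression coefficients of } x \text{ on } y = S_{12}S_{22}^{-1} \tag{2.3}$$

$$B = \text{matrix of s.s. \& s.p. due to regression} = S_{12}S_{22}^{-1}S_{21} \tag{2.4}$$

$$A = \text{"residual" s.s. \& s.p. matrix} = S_{11} - S_{12}S_{22}^{-1}S_{21} = S_{11\cdot 2} \tag{2.5}$$

$$A + B = \text{"total" matrix} = S_{11} \tag{2.6}$$

The corresponding matrices for the population are:
$\beta = \Sigma_{12}\Sigma_{22}^{-1}$, $\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$, $\Sigma_{11\cdot 2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$ and $\Sigma_{11}$ respectively.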
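The matrices (2.3) to (2.6) and the three criteria listed in the introduction can be computed directly from data. The sketch below is illustrative only: the function names and the simulated data are assumptions, and the residual matrix is obtained from the fitted residuals so that $A + B = S_{11}$ becomes a genuine numerical check.

```python
import numpy as np


def regression_matrices(X, Y):
    """X is N x p, Y is N x q. Returns B0, B, A and the total matrix S11."""
    Xc = X - X.mean(axis=0)                 # corrected (centred) observations
    Yc = Y - Y.mean(axis=0)
    S11, S12, S22 = Xc.T @ Xc, Xc.T @ Yc, Yc.T @ Yc
    B0 = S12 @ np.linalg.inv(S22)           # regression coefficients, (2.3)
    B = B0 @ S12.T                          # s.s. and s.p. due to regression, (2.4)
    resid = Xc - Yc @ B0.T                  # residuals of x regressed on y
    A = resid.T @ resid                     # residual s.s. and s.p. matrix, (2.5)
    return B0, B, A, S11


def criteria(A, B):
    """Wilks' Lambda, Pillai's criterion and the Hotelling-Lawley criterion."""
    wilks = np.linalg.det(A) / np.linalg.det(A + B)
    pillai = np.trace(B @ np.linalg.inv(A + B))
    hotelling_lawley = np.trace(B @ np.linalg.inv(A))
    return wilks, pillai, hotelling_lawley


rng = np.random.default_rng(0)
X = rng.standard_normal((50, 2))            # hypothetical data with p = 2
Y = rng.standard_normal((50, 3))            # hypothetical data with q = 3
B0, B, A, S11 = regression_matrices(X, Y)
print(np.allclose(A + B, S11))              # numerical check of (2.6)
print(criteria(A, B))
```

With 50 observations the matrix $S$ here has $n = 49$ d.f.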

If $x$ and $y$ have a normal distribution, $S$ will have a Wishart distribution
and from that, by suitable matrix transformations, it can be shown that
$B_0$, $S_{22}$ and $A$ are independently distributed as below:

$$\text{(1)}\quad \text{Const.}\ \exp\{-\tfrac{1}{2}\operatorname{tr}\Sigma_{11\cdot 2}^{-1}(B_0 - \beta)S_{22}(B_0 - \beta)'\}\,dB_0 \tag{2.7}$$

$$\text{(2)}\quad \text{Const.}\ |S_{22}|^{(n-q-1)/2}\exp\{-\tfrac{1}{2}\operatorname{tr}\Sigma_{22}^{-1}S_{22}\}\,dS_{22} \tag{2.8}$$

$$\text{(3)}\quad \text{Const.}\ |A|^{(n-q-p-1)/2}\exp\{-\tfrac{1}{2}\operatorname{tr}\Sigma_{11\cdot 2}^{-1}A\}\,dA \tag{2.9}$$

Thus $B_0$ has a normal distribution, while the $q \times q$ matrix $S_{22}$ has a Wishart
distribution with $n$ d.f. and the $p \times p$ matrix $A$ has a Wishart distribution
with $n - q$ d.f. We shall denote the last two distributions
(2.8) and (2.9) by $W_q(S_{22}|\Sigma_{22}|n)$ and $W_p(A|\Sigma_{11\cdot 2}|n-q)$. Let $\theta$, $\eta$, $M$, $F$,
$C$, $K$ be lower triangular matrices such that $\Sigma_{22} = \theta\theta'$, $\Sigma_{11\cdot 2} = \eta\eta'$,
$S_{22} = MM'$, $A = FF'$, $B = KK'$ and $A + B = CC'$. Define further

$$U = \eta^{-1}(B_0 - \beta)M \tag{2.10}$$

$$V = \theta^{-1}M \tag{2.11}$$

$$W = \eta^{-1}F \tag{2.12}$$

$$P = \eta^{-1}\Sigma_{12}\theta'^{-1} \tag{2.13}$$

$$\tilde{R} = F^{-1}S_{12}M'^{-1} \tag{2.14}$$

$$R = C^{-1}K \tag{2.15}$$

$$L = RR' = C^{-1}BC'^{-1} \tag{2.16}$$

It can be easily seen that $L = RR'$ is the matrix generalization of $r^2$
and $\tilde{R}\tilde{R}'$ is the matrix generalization of $r^2/(1 - r^2)$. Observe that

$$U + PV = \eta^{-1}S_{12}M'^{-1} \tag{2.17}$$

where $P$ is the population matrix corresponding to $\tilde{R}$. Hence

$$\tilde{R} = W^{-1}(U + PV) \tag{2.18}$$

and

$$\tilde{R}\tilde{R}' = F^{-1}BF'^{-1}, \tag{2.19}$$

as stated earlier.

If we transform to $U$, $V$ and $W$ from $B_0$, $S_{22}$ and $A$ respectively in (2.7),
(2.8) and (2.9), it can be easily seen that

(1) $u_{ij}$ $(i = 1, \ldots, p;\ j = 1, \ldots, q)$, the $pq$ variables
in $U$, are independent $N(0, 1)$ variables.

(2) $v_{jj}$ $(j = 1, \ldots, q)$ are independent $\chi_{n-j+1}$ variates and
the off-diagonal elements $v_{kj}$ $(k > j;\ k, j = 1, \ldots, q)$
of the lower triangular matrix $V$ are
independent $N(0, 1)$ variables, independent of $v_{jj}$ also.

(3) $w_{ii}$ $(i = 1, \ldots, p)$, the diagonal elements of the lower
triangular matrix $W$, are independent $\chi_{n-q-i+1}$ variates,
while $w_{ik}$ $(i > k;\ i, k = 1, \ldots, p)$ are independent
$N(0, 1)$ variables, independent of $w_{ii}$ also.

Since a Wishart distribution is the multivariate matrix generalization
of a $\chi^2$ distribution, $V$ or $W$, which are in a certain sense matrix square
roots of $S_{22}$ and $A$, can be looked upon as matrix generalizations of
a chi-variate. This is further supported by the fact that the diagonal
elements of $V$ and $W$ are chi variables. Consequently (2.18) is the multivariate
analogue of Ruben's representation

$$\tilde{r} = \chi_{n-2}^{-1}(\xi + \tilde{\rho}\,\chi_{n-1}). \tag{2.20}$$
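The identities (2.17) to (2.19) are purely algebraic, so they can be checked numerically on any data set. The following is a minimal sketch under stated assumptions: the population covariance matrix is an arbitrary positive definite matrix invented for the check, the function name is hypothetical, and `np.linalg.cholesky` is used because it returns the lower triangular factor required by the definitions.

```python
import numpy as np


def identity_errors(p=2, q=3, N=40, seed=0):
    """Largest absolute discrepancies in (2.17), (2.18) and (2.19)."""
    rng = np.random.default_rng(seed)
    # Hypothetical positive definite population covariance matrix.
    G = rng.standard_normal((p + q, p + q))
    Sigma = G @ G.T + (p + q) * np.eye(p + q)
    Z = rng.multivariate_normal(np.zeros(p + q), Sigma, size=N)
    Zc = Z - Z.mean(axis=0)
    S = Zc.T @ Zc                                   # corrected s.s. and s.p. matrix

    S11, S12, S22 = S[:p, :p], S[:p, p:], S[p:, p:]
    E11, E12, E22 = Sigma[:p, :p], Sigma[:p, p:], Sigma[p:, p:]
    inv, chol = np.linalg.inv, np.linalg.cholesky   # chol gives the lower factor

    B0, beta = S12 @ inv(S22), E12 @ inv(E22)
    B = B0 @ S12.T                                  # regression matrix
    A = S11 - B                                     # residual matrix
    theta, eta = chol(E22), chol(E11 - beta @ E12.T)
    M, F = chol(S22), chol(A)

    U = inv(eta) @ (B0 - beta) @ M                  # (2.10)
    V = inv(theta) @ M                              # (2.11)
    W = inv(eta) @ F                                # (2.12)
    P = inv(eta) @ E12 @ inv(theta.T)               # (2.13)
    Rt = inv(F) @ S12 @ inv(M.T)                    # (2.14), R-tilde

    e17 = np.abs(U + P @ V - inv(eta) @ S12 @ inv(M.T)).max()
    e18 = np.abs(Rt - inv(W) @ (U + P @ V)).max()
    e19 = np.abs(Rt @ Rt.T - inv(F) @ B @ inv(F.T)).max()
    return e17, e18, e19
```

All three returned values should be at the level of floating-point rounding error.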


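Ruben uses Fisher's approximation of a chi-variate, viz. $\chi_a$ is approximately
normal with mean $(a - 1/2)^{1/2}$ and variance $1/2$, and proves that

$$\frac{\left(\frac{2n-5}{2}\right)^{1/2}\tilde{r} - \left(\frac{2n-3}{2}\right)^{1/2}\tilde{\rho}}{(1 + \tilde{r}^2/2 + \tilde{\rho}^2/2)^{1/2}} \tag{2.21}$$

is approximately $N(0, 1)$. This is a fairly good approximation for all
practical purposes. We now proceed to consider a similar result for our
$\tilde{R}$. Ruben derived (2.21) by equating (2.20) to $\tilde{r}_0$ and then showing that
the approximate normal variate

$$\xi + \tilde{\rho}\,\chi_{n-1} - \tilde{r}_0\,\chi_{n-2}$$

has mean

$$\left(\frac{2n-3}{2}\right)^{1/2}\tilde{\rho} - \left(\frac{2n-5}{2}\right)^{1/2}\tilde{r}_0$$

and variance

$$1 + \tfrac{1}{2}(\tilde{\rho}^2 + \tilde{r}_0^2).$$

He then replaces $\tilde{r}_0^2$ by $\tilde{r}^2$ to get (2.21). We employ a similar procedure
mechanically with the hope of obtaining a suitable approximation to the
distribution of $\tilde{R}$. Consider the matrix

$$\zeta = U + PV - W\tilde{R}_0, \tag{2.22}$$

where $\zeta = [\zeta_{ij}]$, $\tilde{R}_0 = [\tilde{r}^0_{ij}]$ $(i = 1, \ldots, p;\ j = 1, \ldots, q)$ and $\rho_{ij}$ denotes the $(i, j)$ element of $P$.
Using Fisher's approximation of a $\chi$ variate by a normal variate, for
$v_{jj}$ and $w_{ii}$, we can see by a little algebra that the $\zeta_{ij}$ are normally distributed
and

$$E(\zeta_{ij}) = \left(\frac{2n-2j+1}{2}\right)^{1/2}\rho_{ij} - \left(\frac{2n-2q-2i+1}{2}\right)^{1/2}\tilde{r}^0_{ij} \tag{2.23}$$

$$V(\zeta_{ij}) = 1 + \sum_{k=j}^{q}\rho_{ik}^2 + \sum_{k=1}^{i}(\tilde{r}^0_{kj})^2 - \rho_{ij}^2/2 - (\tilde{r}^0_{ij})^2/2 \tag{2.24}$$

$$\operatorname{Cov}(\zeta_{ij}, \zeta_{i'j'}) = \begin{cases} 0, & i \ne i',\ j \ne j' \\ \sum_{k=1}^{i}\tilde{r}^0_{kj}\tilde{r}^0_{kj'} - \tfrac{1}{2}\tilde{r}^0_{ij}\tilde{r}^0_{ij'}, & i = i',\ j \ne j' \\ \sum_{k=j}^{q}\rho_{ik}\rho_{i'k} - \tfrac{1}{2}\rho_{ij}\rho_{i'j}, & i \ne i',\ j = j' \end{cases} \tag{2.25}$$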
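The mean (2.23) and variance (2.24) can be compared with a simulation of (2.22) in which the diagonal elements of $V$ and $W$ are exact chi variates rather than their normal approximations. This is a rough sketch: the matrices `P` and `R0` passed to the functions are arbitrary illustrative values, and the helper names are assumptions.

```python
import numpy as np


def bartlett_factor(dim, df, rng):
    """Lower triangular T with chi variates on the diagonal and N(0,1) below it."""
    T = np.tril(rng.standard_normal((dim, dim)), k=-1)
    for i in range(dim):                       # 0-based row i carries df - i d.f.
        T[i, i] = np.sqrt(rng.chisquare(df - i))
    return T


def zeta_moments(P, R0, n, reps, seed=0):
    """Simulated element-wise mean and variance of zeta = U + PV - W R0."""
    rng = np.random.default_rng(seed)
    p, q = P.shape
    draws = np.empty((reps, p, q))
    for t in range(reps):
        U = rng.standard_normal((p, q))        # independent N(0, 1) variables
        V = bartlett_factor(q, n, rng)         # diagonal: chi with n-j+1 d.f.
        W = bartlett_factor(p, n - q, rng)     # diagonal: chi with n-q-i+1 d.f.
        draws[t] = U + P @ V - W @ R0          # (2.22)
    return draws.mean(axis=0), draws.var(axis=0)


def fisher_moments(P, R0, n):
    """Mean (2.23) and variance (2.24) obtained from Fisher's approximation."""
    p, q = P.shape
    mean = np.empty((p, q))
    var = np.empty((p, q))
    for i in range(1, p + 1):
        for j in range(1, q + 1):
            rho, r0 = P[i - 1, j - 1], R0[i - 1, j - 1]
            mean[i - 1, j - 1] = (np.sqrt((2 * n - 2 * j + 1) / 2) * rho
                                  - np.sqrt((2 * n - 2 * q - 2 * i + 1) / 2) * r0)
            var[i - 1, j - 1] = (1 + np.sum(P[i - 1, j - 1:] ** 2)
                                 + np.sum(R0[:i, j - 1] ** 2)
                                 - rho ** 2 / 2 - r0 ** 2 / 2)
    return mean, var
```

For moderate $n$ the simulated and approximate moments should differ only by simulation error, since Fisher's mean and variance for a chi-variate are very close to the exact values.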


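Following Ruben's argument for $\tilde{r}$, we expect

$$\frac{\left(\frac{2n-2q-2i+1}{2}\right)^{1/2}\tilde{r}_{ij} - \left(\frac{2n-2j+1}{2}\right)^{1/2}\rho_{ij}}{\left(1 + \sum_{k=j}^{q}\rho_{ik}^2 + \sum_{k=1}^{i}\tilde{r}_{kj}^2 - \rho_{ij}^2/2 - \tilde{r}_{ij}^2/2\right)^{1/2}} \tag{2.26}$$

to be an approximate $N(0, 1)$ variable. However, on account of (2.25), the $\tilde{r}_{ij}$
are not independently distributed. For large $n$, the numerator of (2.26)
can very well be taken as

$$\sqrt{n}\,(\tilde{r}_{ij} - \rho_{ij}). \tag{2.27}$$

If we consider the null case, viz. $P = 0$, we find that $\tilde{r}_{ij}$ and $\tilde{r}_{i'j}$
$(i \ne i')$ are uncorrelated and so $\sqrt{n}\,\tilde{R}$ can be approximately regarded as a
random sample from a multivariate normal distribution, with zero means and
a certain covariance matrix. In the bivariate case, when $\rho = 0$, we have
two normal approximations available to us for large $n$, viz. $\sqrt{n}\,\tilde{r}$ is $N(0, 1)$
and the other one is

$$\sqrt{n}\,\tilde{r}/(1 + \tilde{r}^2)^{1/2} = \sqrt{n}\,r \text{ is } N(0, 1). \tag{2.28}$$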

The corresponding multivariate generalizations will be

(a) $\sqrt{n}\,\tilde{R}$ is a matrix of independent $N(0, 1)$ variables (2.29)

in large samples, and

(b) $\sqrt{n}\,D^{-1}\tilde{R}$ is a matrix of independent $N(0, 1)$ variables. (2.30)

Here $D = F^{-1}C$ is a lower triangular matrix and so

$$DD' = F^{-1}CC'F'^{-1} = F^{-1}(A + B)F'^{-1};$$

but $I = F^{-1}AF'^{-1}$ and $\tilde{R}\tilde{R}' = F^{-1}BF'^{-1}$
and therefore

$$DD' = I + \tilde{R}\tilde{R}', \text{ a matrix generalization of } 1 + \tilde{r}^2 \text{ of (2.28)}. \tag{2.31}$$
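Relation (2.31) is an algebraic identity and can be confirmed numerically for any positive definite $A$ and $B$. The sketch below uses hypothetical Wishart-type matrices and an assumed function name; it also checks that $D$ is lower triangular and that standardizing by $D$ leads back to the matrix $L$ of (2.16).

```python
import numpy as np


def check_2_31(p=3, q=4, n=30, seed=0):
    """Numerical check of D = F^{-1}C, DD' = I + R~R~' and the link to L."""
    rng = np.random.default_rng(seed)
    ZA = rng.standard_normal((p, n - q))
    ZB = rng.standard_normal((p, q))
    A, B = ZA @ ZA.T, ZB @ ZB.T              # hypothetical residual and regression matrices
    F = np.linalg.cholesky(A)                # A = FF', F lower triangular
    C = np.linalg.cholesky(A + B)            # A + B = CC', C lower triangular
    Finv, Cinv = np.linalg.inv(F), np.linalg.inv(C)
    D = Finv @ C
    RRt = Finv @ B @ Finv.T                  # R~R~' = F^{-1} B F'^{-1}, as in (2.19)
    Dinv = np.linalg.inv(D)
    L = Cinv @ B @ Cinv.T                    # (2.16)
    return {
        "D_lower_triangular": bool(np.allclose(D, np.tril(D))),
        "DDt_equals_I_plus_RRt": bool(np.allclose(D @ D.T, np.eye(p) + RRt)),
        "standardized_equals_L": bool(np.allclose(Dinv @ RRt @ Dinv.T, L)),
    }
```

Each entry of the returned dictionary should be `True` whenever $A$ is positive definite.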

We shall investigate (b) first. If (b) is true, we shall expect
the $p \times p$ matrix

$$\Gamma = nD^{-1}\tilde{R}\tilde{R}'D'^{-1} \tag{2.32}$$

to have the distribution

$$W_p(\Gamma|I|q)\,d\Gamma \tag{2.33}$$

for large $n$. Now

$$\frac{1}{n}\Gamma = D^{-1}\tilde{R}\tilde{R}'D'^{-1} = C^{-1}F\tilde{R}\tilde{R}'F'C'^{-1} = C^{-1}BC'^{-1} = L \text{ by (2.16)}. \tag{2.34}$$


When $P = 0$, $B$ has the $W_p(B|\Sigma_{11}|q)$ distribution and $A$ has an independent
$W_p(A|\Sigma_{11}|n-q)$ distribution. We transform from $A$ and $B$ to $C$ and $\Gamma$ by (2.34)
and $CC' = A + B$, integrate out $C$ and find that the distribution of $\Gamma$ is
(see Kshirsagar, 1961a)

$$\text{Const.}\ |\Gamma|^{(q-p-1)/2}\,\left|I - \tfrac{1}{n}\Gamma\right|^{(n-q-p-1)/2}\,d\Gamma. \tag{2.35}$$

But as $n \to \infty$,

$$\left|I - \tfrac{1}{n}\Gamma\right|^{(n-q-p-1)/2} \to e^{-\frac{1}{2}\operatorname{tr}\Gamma}$$

so that, in large samples, $\Gamma$ has the Wishart distribution

$$W_p(\Gamma|I|q)\,d\Gamma, \tag{2.36}$$

as we expected in (2.33), if (b) is true. This, of course, is not a proof
of (b) but it supports our conjecture about the large sample behaviour of
$\sqrt{n}\,D^{-1}\tilde{R}$.

As regards (a), we observe that

$$\tilde{R}\tilde{R}' = F^{-1}BF'^{-1} \text{ and } A = FF'. \tag{2.37}$$

Transforming from $A$ and $B$ to $F$ and $\tilde{R}$, we find the distribution of
$\Delta = n\tilde{R}\tilde{R}'$ to be

$$\text{Const.}\ |\Delta|^{(q-p-1)/2}\,\left|I + \tfrac{1}{n}\Delta\right|^{-(n-q+p+1)/2}\,d\Delta. \tag{2.38}$$


This, as $n \to \infty$, tends to

$$\text{Const.}\ |\Delta|^{(q-p-1)/2}\exp(-\tfrac{1}{2}\operatorname{tr}\Delta)\,d\Delta$$

or

$$W_p(\Delta|I|q)\,d\Delta, \tag{2.39}$$

as it should if (a) is true.

So, for testing the null hypothesis $P = 0$, which is the same as
$\Sigma_{12} = 0$, we have two criteria

$$\operatorname{tr}\Delta = n\operatorname{tr}A^{-1}B \quad\text{and}\quad \operatorname{tr}\Gamma = n\operatorname{tr}(A + B)^{-1}B. \tag{2.40}$$

Both of them have a $\chi^2$ distribution with $pq$ d.f., for large $n$. Both these
criteria are well known in literature: $\operatorname{tr}A^{-1}B$ is the Hotelling (1951)-Lawley (1938)
criterion and $\operatorname{tr}(A + B)^{-1}B$ is Pillai's criterion (1955).
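The large sample statement about (2.40) can be examined by simulation. In the sketch below (an illustration, not the author's computation) $A$ and $B$ are drawn as independent Wishart matrices with identity covariance, which is the null situation described above after standardization; the function name and the values of $p$, $q$, $n$ are assumptions. A $\chi^2$ variable with $pq$ d.f. has mean $pq$, so the sample means of both criteria should be close to $pq$.

```python
import numpy as np


def null_criteria(p, q, n, reps, seed=0):
    """Simulate n tr(A^{-1}B) and n tr((A+B)^{-1}B) under the null hypothesis."""
    rng = np.random.default_rng(seed)
    hotelling_lawley = np.empty(reps)
    pillai = np.empty(reps)
    for t in range(reps):
        ZA = rng.standard_normal((p, n - q))
        ZB = rng.standard_normal((p, q))
        A = ZA @ ZA.T                        # Wishart, identity covariance, n - q d.f.
        B = ZB @ ZB.T                        # Wishart, identity covariance, q d.f.
        hotelling_lawley[t] = n * np.trace(np.linalg.solve(A, B))
        pillai[t] = n * np.trace(np.linalg.solve(A + B, B))
    return hotelling_lawley, pillai


hl, pil = null_criteria(p=2, q=3, n=200, reps=3000)
print(hl.mean(), pil.mean())                 # both should be near pq = 6
```

The Hotelling-Lawley value always exceeds the Pillai value for the same sample, so its simulated mean sits slightly above $pq$ for finite $n$.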


3. NON-NULL DISTRIBUTIONS OF $\Gamma$ AND $\Delta$
In many practical situations $y$ is a vector of dummy variables representing
differences among $q + 1$ groups or populations and one is interested
in constructing discriminant functions for these groups. In this case, it
is known that the number of discriminant functions is equal to the number
of non-zero true canonical correlations between $x$ and $y$. In particular, if
$\rho_1$ is the only non-zero true canonical correlation and $\rho_2, \rho_3, \ldots, \rho_p$
are all null, the group means are collinear and only one discriminant function
is adequate. This is the canonical variate corresponding to $\rho_1$. Suppose

$$a'x = a_1x_1 + \cdots + a_px_p \tag{3.1}$$

is an assigned or hypothetical function and we want to test its goodness
of fit for discriminating among $q + 1$ groups. The hypothesis of goodness
of fit of $a'x$ comprises two parts:

(I) Direction Aspect: Whether $a'x$ agrees with the true
discriminant function, viz. the canonical variate
corresponding to $\rho_1$, and

(II) Collinearity Aspect: Whether one discriminant function
can be adequate at all or, in other words, whether $\rho_1$ is
the only non-zero canonical correlation.

Bartlett (1951) and Williams (1955) derived exact tests based on
factorization of Wilks' $\Lambda$ criterion, $|A|/|A + B|$, for this purpose. Our
aim here is to derive similar tests for (I) and (II) based on the other
two criteria, Hotelling (1951)-Lawley (1938) and Pillai (1955). For this
purpose, we shall derive the non-null distributions of $\Gamma$ and of $\Delta$, in
the linear case, i.e., the case where $\rho_1 \ne 0$, $\rho_2 = \cdots = \rho_p = 0$. This
is called the linear case because the means of the $q + 1$ groups are collinear
or lie in a space of 1 dimension.
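The linear case can be illustrated with simulated groups whose means lie on a line. In this sketch the direction of the line, the positions of the group means and the group sizes are all hypothetical; $y$ is coded as $q = 3$ dummy variables for $q + 1 = 4$ groups, and the squared sample canonical correlations are obtained as the roots of $(A + B)^{-1}B$.

```python
import numpy as np


def canonical_r2(X, Y):
    """Squared sample canonical correlations between the columns of X and Y."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    S11, S12, S22 = Xc.T @ Xc, Xc.T @ Yc, Yc.T @ Yc
    B = S12 @ np.linalg.solve(S22, S12.T)            # s.s. and s.p. due to regression
    roots = np.linalg.eigvals(np.linalg.solve(S11, B)).real
    return np.sort(roots)[::-1]                      # largest root first


def collinear_groups(per_group=200, seed=0):
    """Four groups in three dimensions whose means lie on one line (assumed setup)."""
    rng = np.random.default_rng(seed)
    direction = np.array([1.0, 2.0, -1.0]) / np.sqrt(6.0)   # hypothetical line of means
    positions = np.array([-3.0, -1.0, 1.0, 3.0])            # mean position of each group
    labels = np.repeat(np.arange(4), per_group)
    noise = rng.standard_normal((labels.size, 3))
    X = positions[labels, None] * direction + noise
    Y = (labels[:, None] == np.arange(3)).astype(float)     # dummy variables, last group as reference
    return X, Y


X, Y = collinear_groups()
print(canonical_r2(X, Y))
```

With collinear means, only the first squared canonical correlation should be appreciably different from zero; the remaining roots reflect sampling variation alone.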

Let $x^*$, $y^*$ be the vectors of the true (population) canonical variables
and let the relationship between $x^*$ and $x$ be

$$x^* = \delta x \tag{3.2}$$

where $\delta$ is a $p \times p$ non-singular matrix. $x^*$ and $y^*$ have, therefore, $I_p$
and $I_q$ as their variance-covariance matrices respectively and, except for
$\rho_1$, the correlation between $x_1^*$ and $y_1^*$, all other correlations are zero.

Define

$$A^* = \delta A\delta', \quad B^* = \delta B\delta', \quad C^*C^{*\prime} = A^* + B^*, \tag{3.3}$$

where $C^*$ is a lower triangular matrix.


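Then, the distribution of

$$L^* = [l^*_{ij}] = C^{*-1}B^*C^{*\prime-1}, \tag{3.4}$$

when $y^*$ is fixed, is shown to be (Kshirsagar, 1961a)

$$\text{Const.}\ \phi(\lambda^2, l^*_{11})\,|L^*|^{(q-p-1)/2}\,|I - L^*|^{(n-q-p-1)/2}\,dL^* \tag{3.5}$$

where

$$\phi(\lambda^2, l^*_{11}) = e^{-\lambda^2/2}\,{}_1F_1\!\left(\frac{n}{2};\ \frac{q}{2};\ \frac{\lambda^2 l^*_{11}}{2}\right) \tag{3.6}$$

and

$$\lambda^2 = \rho_1^2 \sum_{r} y^{*2}_{1r} \Big/ (1 - \rho_1^2). \tag{3.7}$$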

As in section 2, for large $n$,

$$|I - L^*|^{(n-q-p-1)/2}$$

can be replaced by

$$\exp\{-\tfrac{1}{2}\operatorname{tr}\Gamma^*\} \tag{3.8}$$

where

$$\Gamma^* = nL^* \tag{3.9}$$

and so, $\Gamma^*$ will have a non-central Wishart distribution of the linear case
(Anderson, 1946) for large $n$. Make a further transformation

$$\Gamma^* = nL^* = S^*S^{*\prime} \tag{3.10}$$

where $S^* = [s^*_{ij}]$ is a lower triangular matrix. Then it can be readily
seen that, for large $n$, $s^{*2}_{11}$ is a non-central $\chi^2$ (non-centrality parameter
is $\lambda^2$), $s^{*2}_{ii}$ is a $\chi^2$ with $q + 1 - i$ d.f. $(i = 2, \ldots, p)$, $s^*_{ij}$
$(i > j;\ i = 2, \ldots, p;\ j = 1, \ldots, p - 1)$ is $N(0, 1)$ and all these variables are
independent. The over-all criterion for testing the independence of $x$
and $y$ (which in this case means that all the $q + 1$ groups have identical means)
is, as seen in section 2, $\operatorname{tr}\Gamma$, which is the same as $\operatorname{tr}\Gamma^*$ on account
of (3.3) and

$$\operatorname{tr}\Gamma^* = (s^{*2}_{11}) + (s^{*2}_{21} + \cdots + s^{*2}_{p1}) + \Big(\sum_{\substack{i, j = 2 \\ i \ge j}}^{p} s^{*2}_{ij}\Big) = y_1 + y_2 + y_3, \text{ say.} \tag{3.11}$$

Then $y_1$ contains the entire non-centrality; $y_2$ is a $\chi^2$ with $p - 1$ d.f. and
$y_3$ is a $\chi^2$ with $(p - 1)(q - 1)$ d.f.

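Let

$$S^* = \begin{pmatrix} s^*_{11} & 0 \\ s^*_{21} & S^*_{22} \end{pmatrix}, \tag{3.12}$$

partitioned into $1$ and $p - 1$ rows and $1$ and $p - 1$ columns, and let

$$B^* = K^*K^{*\prime}, \quad K^* \text{ is lower triangular,} \tag{3.13}$$

$$K^* = \begin{pmatrix} k^*_{11} & 0 \\ k^*_{21} & K^*_{22} \end{pmatrix}, \tag{3.14}$$

$$C^* = \begin{pmatrix} c^*_{11} & 0 \\ c^*_{21} & C^*_{22} \end{pmatrix}; \tag{3.15}$$

then

$$S^* = \sqrt{n}\,C^{*-1}K^*, \tag{3.16}$$

and after a little algebra, we find that

$$\frac{1}{n}y_1 = \frac{\delta'_{(1)}B\delta_{(1)}}{\delta'_{(1)}(A + B)\delta_{(1)}} \tag{3.17}$$

$$\frac{1}{n}y_2 = \frac{\delta'_{(1)}B(A + B)^{-1}B\delta_{(1)}}{\delta'_{(1)}B\delta_{(1)}} - \frac{\delta'_{(1)}B\delta_{(1)}}{\delta'_{(1)}(B + A)\delta_{(1)}} \tag{3.18}$$

$$\frac{1}{n}y_3 = \operatorname{tr}B(A + B)^{-1} - \frac{\delta'_{(1)}B(A + B)^{-1}B\delta_{(1)}}{\delta'_{(1)}B\delta_{(1)}} \tag{3.19}$$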

where $\delta'_{(1)}$ is the first row of the matrix $\delta$, defined by (3.2). If we
are testing the goodness of fit of a hypothetical function $a'x$, we are
testing the hypothesis:

$$H:\ \rho_1 \ne 0,\ \rho_2 = \cdots = \rho_p = 0 \text{ and } a'x \text{ is the first true canonical variate, i.e., } x_1^* = a'x. \tag{3.20}$$

But $x_1^* = \delta'_{(1)}x$ by (3.2) and so, if $H$ is true, $\delta_{(1)}$ and $a$ are identical
and hence we can use $y_2$ given by (3.18) and $y_3$ given by (3.19), with $\delta_{(1)}$
replaced by $a$, for testing the "direction" aspect and the "collinearity"
aspect of $H$. The over-all test of $H$ is given by $y_2 + y_3$, and $y_2$, $y_3$ are
the direction and collinearity parts of $\operatorname{tr}B(A + B)^{-1}$. The latter can be
justified by an argument similar to the one employed by the author elsewhere
(1961b), for testing the goodness of fit of a hypothetical principal
component.
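For a given hypothetical vector $a$, the three parts in (3.17) to (3.19) are simple quadratic forms. The following sketch computes them with the first row of $\delta$ replaced by $a$; the function name and the numerical inputs used in any call are assumptions, and the large sample reference distributions are those stated in the text, not something the code verifies.

```python
import numpy as np


def pillai_partition(a, A, B, n):
    """Return (y1, y2, y3) of (3.17)-(3.19) with the first row of delta replaced by a."""
    a = np.asarray(a, dtype=float)
    T = A + B                                      # "total" matrix
    Ba = B @ a
    aBa = a @ Ba                                   # a'Ba
    aTa = a @ T @ a                                # a'(A + B)a
    quad = Ba @ np.linalg.solve(T, Ba)             # a'B(A + B)^{-1}Ba
    y1 = n * aBa / aTa                             # (3.17)
    y2 = n * (quad / aBa - aBa / aTa)              # direction part, (3.18)
    y3 = n * (np.trace(np.linalg.solve(T, B)) - quad / aBa)   # collinearity part, (3.19)
    return y1, y2, y3
```

By construction the three parts add up to $n \operatorname{tr} B(A + B)^{-1}$, and both $y_2$ and $y_3$ are non-negative for any $a$ with $a'Ba > 0$.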

In exactly a similar manner, we can show that, for the other criterion
$\operatorname{tr}BA^{-1}$, the partitioning is

$$n\operatorname{tr}BA^{-1} = \zeta_1 + \zeta_2 + \zeta_3, \tag{3.21}$$

where

$$\frac{1}{n}\zeta_1 = \frac{a'Ba}{a'Aa} \tag{3.22}$$

$$\frac{1}{n}\zeta_2 = \frac{a'BA^{-1}Ba}{a'Ba} - \frac{a'Ba}{a'Aa} \tag{3.23}$$

$$\frac{1}{n}\zeta_3 = \operatorname{tr}BA^{-1} - \frac{a'BA^{-1}Ba}{a'Ba}. \tag{3.24}$$

$\zeta_2$ is a $\chi^2$ with $(p - 1)$ d.f. and $\zeta_3$ is a $\chi^2$ with $(p - 1)(q - 1)$ d.f. in large
samples and these are respectively the "direction" and "collinearity" parts
and can be used to test these aspects of the null hypothesis $H$.
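The three parts of $n \operatorname{tr} BA^{-1}$ in (3.22) to (3.24) can also be obtained from a triangular factor, mirroring the construction used for the Pillai criterion. The sketch below computes them both ways and is only illustrative: the names `z1`, `z2`, `z3` are placeholders for the three parts, and $\delta$ is taken to be an identity matrix whose first row is replaced by $a'$, which is one convenient non-singular choice and requires $a_1 \ne 0$.

```python
import numpy as np


def hl_partition_formula(a, A, B, n):
    """Three parts of n tr(BA^{-1}) computed from the quadratic forms (3.22)-(3.24)."""
    Ba = B @ a
    aBa, aAa = a @ Ba, a @ A @ a
    quad = Ba @ np.linalg.solve(A, Ba)               # a'BA^{-1}Ba
    z1 = n * aBa / aAa                               # (3.22)
    z2 = n * (quad / aBa - aBa / aAa)                # (3.23), direction part
    z3 = n * (np.trace(np.linalg.solve(A, B)) - quad / aBa)   # (3.24), collinearity part
    return z1, z2, z3


def hl_partition_triangular(a, A, B, n):
    """Same three parts from the lower triangular factor of n F^{-1} B* F'^{-1}."""
    p = a.size
    delta = np.eye(p)
    delta[0] = a                                     # first row is a' (assumes a[0] != 0)
    A_s, B_s = delta @ A @ delta.T, delta @ B @ delta.T
    F = np.linalg.cholesky(A_s)                      # A* = FF'
    K = np.linalg.cholesky(B_s)                      # B* = KK'
    S = np.sqrt(n) * np.linalg.solve(F, K)           # lower triangular factor
    z1 = S[0, 0] ** 2                                # leading element
    z2 = np.sum(S[1:, 0] ** 2)                       # rest of the first column
    z3 = np.sum(S[1:, 1:] ** 2)                      # all remaining elements
    return z1, z2, z3
```

Agreement of the two functions shows that the partition is a split of the sum of squares of a triangular factor into its leading element, the rest of its first column, and the remaining elements.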